Abstract


This  paper  presents  a  special  purpose  dual  linear  programming  algorithm 
to  solve  linear  least  absolute  value  problems.  In  addition,  strategies 
involving  start  procedures  are  examined.  Implementations  of  computer- 
based  techniques  are  discussed.  Computational  results  with  three  computer 
code  versions  of  the  algorithm  are  given. 


1.  Introduction 


The least absolute value criterion has been considered as an
alternative to least squares in fitting a linear model.
Least absolute value estimation chooses the unknown parameters in a
stochastic model so as to minimize the sum of the absolute deviations
of a given set of observations from the values predicted by the
model.


The problem examined here can be stated as follows: Given a
set of n observational measurements $(Y_i, x_{i1}, x_{i2}, \ldots, x_{im})$,
$i = 1, 2, \ldots, n$, determine the value for
$\beta = (\beta_1, \beta_2, \ldots, \beta_m)^T$ which will minimize

$$\sum_{i=1}^{n} \left| Y_i - x_{i1}\beta_1 - x_{i2}\beta_2 - \cdots - x_{im}\beta_m \right| . \tag{1}$$


It has been noted by Glahe and Hunt [8] that the least absolute
value criterion has, in particular, interested econometricians who
frequently estimate parameters of linear models with a relatively small
number of observations. Another advantage of the least absolute value
estimator is its resistance to outliers in the data or to heavy-tailed
error distributions (see, for example, Rice and White [10]). A major
difference between the least squares and the least absolute value
estimate is the fact that the least squares criterion always produces
a unique optimum in the objective function while the least absolute
value criterion at times can have multiple optimal solutions. Since
most of the data used in real world applications of curve fitting
problems are far from accurate, as in econometric and business research,
a method that brings out the nonuniqueness of the estimates might
at times be preferable. However, an important handicap of the least
absolute value criterion is that the estimates are governed by a
restricted set of observations, those that are inside the convex hull of
all observations.
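The resistance to outliers and the possible nonuniqueness can be seen in the simplest case, a model with one parameter and $x_{i1} = 1$ for every observation, where the least squares fit is the mean and a least absolute value fit is a median. The following Python sketch is only an illustration; the data and the helper name `sum_abs_dev` are hypothetical and do not come from the paper.

```python
from statistics import mean, median

def sum_abs_dev(y, beta):
    """Sum of absolute deviations of the observations y from the constant beta."""
    return sum(abs(yi - beta) for yi in y)

# Hypothetical data: four observations, the last one an outlier.
y = [1.0, 2.0, 3.0, 100.0]

ls_estimate = mean(y)      # least squares fit of a constant, pulled toward the outlier
lav_estimate = median(y)   # one least absolute value fit of a constant

# Every beta between the two middle observations gives the same objective value,
# so the least absolute value solution is not unique for this data set.
objectives = [sum_abs_dev(y, b) for b in (2.0, 2.5, 3.0)]
```

With an even number of observations, as here, the whole interval between the two middle values is optimal, which is the kind of nonuniqueness the text refers to.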

Recently, there has been a steadily increasing interest in least
absolute value estimates and their properties due to statisticians'
awareness of the limitations of least squares analysis, as well as the
development of computationally efficient algorithms for providing the least
absolute value estimates. Charnes, Cooper and Ferguson [6] appear to be
the first to have demonstrated that linear least absolute value problems can
be rewritten as linear programming problems. Employing their result here,
problem (1) is equivalent to:
$$\begin{aligned}
\text{Minimize} \quad & \sum_{i=1}^{n} (P_i + N_i) \\
\text{subject to} \quad & \sum_{j=1}^{m} x_{ij}\beta_j + P_i - N_i = Y_i, && i = 1, 2, \ldots, n, \\
& P_i \ge 0, \quad N_i \ge 0, && i = 1, 2, \ldots, n,
\end{aligned} \tag{2}$$

where $P_i$ and $N_i$ are, respectively, the positive and negative deviation
associated with the i-th observation.

Barrodale and Young [4], Usow [14], Robers and Ben-Israel [11],
Abdelmalek [1], Schlossmacher [12], Spyropoulos, Kiountouzis and Young
[13], Barrodale and Roberts [3], and Armstrong, Frome and Kung [2]
present special purpose primal algorithms to solve (2). The algorithm
given here is a special purpose dual algorithm to solve this problem.
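To make the equivalence between the least absolute value objective and the linear program concrete, the sketch below splits each residual into a positive part $P_i$ and a negative part $N_i$ for a fixed parameter vector and checks that the two objectives agree. It is a minimal illustration in Python; the function name `split_deviations` and the data are hypothetical.

```python
def split_deviations(X, Y, beta):
    """Return (P, N) with P[i] - N[i] = Y[i] - sum_j X[i][j] * beta[j] and P, N >= 0."""
    P, N = [], []
    for xi, yi in zip(X, Y):
        r = yi - sum(xij * bj for xij, bj in zip(xi, beta))
        P.append(max(r, 0.0))    # positive deviation
        N.append(max(-r, 0.0))   # negative deviation
    return P, N

# Hypothetical data: intercept plus one regressor, third observation off the line.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]]
Y = [0.0, 1.0, 5.0, 3.0]
beta = [0.0, 1.0]

P, N = split_deviations(X, Y, beta)
lp_objective = sum(p + n for p, n in zip(P, N))
lav_objective = sum(
    abs(yi - sum(a * b for a, b in zip(xi, beta))) for xi, yi in zip(X, Y)
)
```

At most one of $P_i$ and $N_i$ is nonzero in this construction, so $P_i + N_i$ equals the absolute residual of the i-th observation.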


2.  Algorithm 

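The dual problem of (2) is:

$$\begin{aligned}
\text{Maximize} \quad & \sum_{i=1}^{n} \pi_i Y_i \\
\text{subject to} \quad & \sum_{i=1}^{n} \pi_i x_{ij} = 0, && j = 1, 2, \ldots, m, \\
& \pi_i \le 1, && i = 1, 2, \ldots, n, \\
& \pi_i \ge -1, && i = 1, 2, \ldots, n.
\end{aligned} \tag{3}$$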
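A short derivation, which is not part of the original text, shows why every feasible solution of the dual bounds the least absolute value objective from below. Take any $\beta$ and any $\pi$ with $\sum_{i} \pi_i x_{ij} = 0$ for $j = 1, \ldots, m$ and $-1 \le \pi_i \le 1$ for all i. Then:

```latex
\sum_{i=1}^{n} \pi_i Y_i
  = \sum_{i=1}^{n} \pi_i \Bigl( Y_i - \sum_{j=1}^{m} x_{ij}\beta_j \Bigr)
  \le \sum_{i=1}^{n} \Bigl| Y_i - \sum_{j=1}^{m} x_{ij}\beta_j \Bigr| .
```

The equality uses the dual equality constraints and the inequality uses $|\pi_i| \le 1$.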

Assume the basis matrix B, of dimension m by m, to be of full rank.
Rank deficiencies can easily be handled within the linear programming
framework (see Ben-Israel and Charnes [5]). Define IB to be the index
set of the basic variables, and the index sets NL and NU to be indicators
for the nonbasic variables which are respectively at their lower and
upper bounds. Define $\pi_B$ to be the vector of the basic variables.

The initial basic solution is given as follows. All
nonbasic variables are set to their upper bound value, namely, +1.

The values of the basic variables are:

$$\pi_B = -B^{-1} h,$$

where $h_j = \sum_{i \in NU} \pi_i x_{ij}$, $j = 1, 2, \ldots, m$.


If $\pi_B$ satisfies $-1 \le \pi_i \le +1$, $i \in IB$, the current basis B is feasible
and the algorithm will proceed directly to phase 2 of the simplex method.
Otherwise, $\pi_B$ is infeasible and a phase 1 procedure is required to
produce a feasible solution.
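The start procedure can be sketched in Python as follows. The sketch assumes that the columns of B are the rows of X belonging to the basic indices, so that the dual equality constraints read $B\pi_B + h = 0$ with every nonbasic $\pi_i$ fixed at +1. The helper names `solve` and `initial_basic_solution`, the zero-based indices, and the data are hypothetical.

```python
from fractions import Fraction

def solve(A, b):
    """Solve the square system A x = b by Gauss-Jordan elimination (A assumed full rank)."""
    m = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(m):
        pivot = next(r for r in range(col, m) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(m):
            if r != col and M[r][col] != 0:
                M[r] = [vr - M[r][col] * vc for vr, vc in zip(M[r], M[col])]
    return [M[r][m] for r in range(m)]

def initial_basic_solution(X, IB):
    """Fix every nonbasic pi_i at +1 and solve sum_i pi_i * x_ij = 0 for the basic ones."""
    n, m = len(X), len(X[0])
    NU = [i for i in range(n) if i not in IB]
    h = [sum(X[i][j] for i in NU) for j in range(m)]   # h_j with pi_i = +1 on NU
    B = [[X[i][j] for i in IB] for j in range(m)]      # column of B = basic row of X
    pi_B = solve(B, [-hj for hj in h])
    return pi_B, NU

# Hypothetical data: four observations, two parameters, first two rows basic.
X = [[1, 0], [1, 1], [1, 2], [1, 3]]
pi_B, NU = initial_basic_solution(X, IB=[0, 1])
feasible = all(-1 <= v <= 1 for v in pi_B)
```

For this data the basic values fall outside the interval from -1 to +1, so a phase 1 procedure would be needed before phase 2 could start.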

2.1  Phase  1 

Define $c_B$ to be the basic cost vector in the phase 1 process. The
values of $c_B$ are determined as follows:

$$c_{B_j} = \begin{cases}
0 & \text{if } -1 \le \pi_{B_j} \le +1, \ j \in IB, \\
-\operatorname{sign}(\pi_{B_j}) & \text{otherwise}, \ j \in IB.
\end{cases}$$

The termination criterion for this process is that all values of $c_B$ are
equal to zero.
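The phase 1 cost rule and its termination test can be written in a few lines of Python. This is a sketch only; the function name `phase1_costs` and the sample values are hypothetical.

```python
def phase1_costs(pi_B):
    """Phase 1 cost of each basic variable: 0 if within [-1, +1], else minus its sign."""
    costs = []
    for v in pi_B:
        if -1 <= v <= 1:
            costs.append(0)
        else:
            costs.append(-1 if v > 0 else 1)
    return costs

# Hypothetical basic values: two infeasible, one feasible.
c_B = phase1_costs([3, -5, 0.5])
phase1_done = all(c == 0 for c in c_B)
```

A basic variable above +1 receives cost -1 and one below -1 receives cost +1, so maximizing the phase 1 objective pushes both back toward the feasible interval.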

If the termination criterion is not satisfied, the algorithm then
determines a nonbasic variable, $\pi_s$, $s \in (NL \cup NU)$, to enter the basis. To
accomplish this, the algorithm calculates the reduced costs of the nonbasic
variables, $z_k$, $k \in (NL \cup NU)$, as is done in the standard linear
programming technique. The reduced costs are:

$$z_k = c_B B^{-1} x_k, \quad k \in NL \cup NU,$$

where $x_k$ is the k-th row of the observational matrix X, of dimension
n by m.


The candidates for the entering variable satisfy the following
relations:

$$z_k < 0 \ \text{for } k \in NL, \qquad z_k > 0 \ \text{for } k \in NU. \tag{4}$$


The procedure for choosing the vector to enter the basis consists of
selecting the maximum of the absolute values of $z_k$. This will not, in
general, give the largest improvement in the objective function value,
but does give the fastest change in the objective per unit change of the
incoming variable.
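The entering rule can be sketched in Python as follows, taking the reduced costs as given. Eligibility follows the sign conditions stated above (negative reduced cost at the lower bound, positive at the upper bound). The function name `choose_entering` and the numbers are hypothetical.

```python
def choose_entering(z, NL, NU):
    """Return the eligible nonbasic index with the largest |z_k|, or None if there is none."""
    eligible = [k for k in NL if z[k] < 0] + [k for k in NU if z[k] > 0]
    if not eligible:
        return None
    return max(eligible, key=lambda k: abs(z[k]))

# Hypothetical reduced costs for four nonbasic variables.
z = {2: -0.5, 3: 4.0, 4: -3.0, 5: 1.5}
s = choose_entering(z, NL=[2, 3], NU=[4, 5])
```

Indices 3 and 4 have the largest absolute reduced costs but the wrong sign for their bound, so they are not candidates and the choice falls on index 5.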


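If the nonbasic variable $\pi_s$ is considered to be brought into the
basis, the algorithm then calculates the amount of change required by the
entering variable to force the feasibility of the leaving variable. The
value of the change, $\theta$, is obtained from finding the minimum of the
following:

$$\theta = \min \begin{cases}
\dfrac{1 - \rho\,\delta_j\,\pi_{B_j}}{|\xi_j|} & \text{for } c_{B_j} = 0, \ j = 1, 2, \ldots, m, \\[2ex]
-\dfrac{1 + \rho\,\delta_j\,\pi_{B_j}}{|\xi_j|} & \text{for } \rho\,\xi_j\,c_{B_j} > 0, \ c_{B_j} \ne 0, \ j = 1, 2, \ldots, m, \\[2ex]
2,
\end{cases} \tag{5}$$

where $\xi = B^{-1}(x_{s1}, x_{s2}, \ldots, x_{sm})^T$, $\delta_j = \operatorname{sign}(\xi_j)$,
$j = 1, 2, \ldots, m$, and $\rho = \operatorname{sign}(z_s)$.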
If $\theta = 2$, $\pi_s$ will remain nonbasic but will switch to its opposite
bound value. Furthermore, the values of the basic variables $\pi_{B_j}$,
$j = 1, 2, \ldots, m$, will be updated as follows:

$$\pi_{B_j} = \pi_{B_j} + 2\rho\,\xi_j, \quad j = 1, 2, \ldots, m.$$


If $\theta \ne 2$ and the minimum ratio value comes from the r-th basic variable,
the value of $|z_s|$ will be decreased by $\xi_r$. If this updated value, $\bar{z}$, where
$\bar{z} = |z_s| - \xi_r$, remains positive, no pivoting is performed. Rather, the
value of $c_{B_r}$ will equal zero. The algorithm then calculates $\theta$ from (5)
with $c_{B_r} = 0$, and evaluates the basic variable to be considered to leave the
basis. On the other hand, if $\bar{z}$ is negative, the values of the basic
variables $\pi_{B_j}$, $j = 1, 2, \ldots, m$, become:

$$\pi_{B_j} = \pi_{B_j} + \rho\,\theta\,\xi_j \quad \text{for } j \ne r,$$
$$\pi_{B_r} = \pi_s (1 - \theta).$$

The algorithm then performs the pivoting procedure of the simplex
method in which the nonbasic $\pi_s$ enters the basis and the basic variable $\pi_{B_r}$
leaves the basis.

After the updating process is completed, the algorithm checks the
feasibility level of the basic variables and continues with the above
iterative procedure until $-1 \le \pi_i \le +1$, $i \in IB$, is satisfied. It
then proceeds to the phase 2 procedure.


2.2  Phase  2

The optimality conditions are characterized by the following:

$$z_k - Y_k \ge 0 \ \text{for } k \in NL, \qquad z_k - Y_k \le 0 \ \text{for } k \in NU, \tag{6}$$

where $z_k - Y_k = Y_B B^{-1} x_k - Y_k$ and $Y_B$ is the basic cost vector.

If the conditions in (6) are not satisfied, the algorithm then
determines the most violating reduced cost by finding the maximum of the
absolute values of the reduced costs. If $|z_s - Y_s|$ is the maximum, this
means that the nonbasic variable $\pi_s$ is considered to enter the basis.


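The algorithm then finds the basic variable to be examined by
calculating the minimum of the following:

$$\theta = \min \begin{cases}
\dfrac{1 - \rho\,\delta_j\,\pi_{B_j}}{|\xi_j|} & \text{for } \xi_j \ne 0, \ j = 1, 2, \ldots, m, \\[2ex]
2,
\end{cases}$$

with $\xi_j$ and $\delta_j$ as in phase 1 and $\rho$ now taking the sign of $z_s - Y_s$.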


If $\theta = 2$, $\pi_s$ will remain as a nonbasic variable but will switch to
its opposite bound value. If $\theta \ne 2$, and the minimum ratio corresponds
to the r-th basic variable, $\pi_s$ will enter the basis to replace $\pi_{B_r}$, and
the pivoting procedure of the simplex method will be carried out. In
any event, the values of the basic variables will be updated:

$$\pi_{B_j} = \pi_{B_j} + \rho\,\theta\,\xi_j \quad \text{for all } j \ne r,$$
$$\pi_{B_r} = \begin{cases}
\pi_{B_r} + 2\rho\,\xi_r & \text{for } \theta = 2, \\
\pi_s (1 - \theta) & \text{for } \theta \ne 2.
\end{cases}$$
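The phase 2 step can be sketched in Python as follows. Because the ratio expression is damaged in the source, the sketch rests on an assumption: for a feasible basic variable, the ratio is the distance to the bound it is moving toward divided by $|\xi_j|$, capped at 2, the distance between the two bounds of the entering variable. The function names `ratio_test` and `update` and the numbers are hypothetical.

```python
def ratio_test(pi_B, xi, rho):
    """Return (theta, r): step length and blocking basic position (r is None if theta == 2)."""
    theta, r = 2.0, None
    for j, (p, x) in enumerate(zip(pi_B, xi)):
        if x == 0:
            continue                              # this basic variable does not move
        delta = 1 if x > 0 else -1
        ratio = (1 - rho * delta * p) / abs(x)    # distance to the bound being approached
        if ratio < theta:
            theta, r = ratio, j
    return theta, r

def update(pi_B, xi, rho, pi_s, theta, r):
    """Move every basic variable by rho*theta*xi_j; on a pivot, slot r takes pi_s*(1 - theta)."""
    new = [p + rho * theta * x for p, x in zip(pi_B, xi)]
    if r is not None:
        new[r] = pi_s * (1 - theta)
    return new

# Pivot case: the first basic variable reaches its upper bound at theta = 0.5.
theta, r = ratio_test([0.5, -0.25], [1.0, -0.5], rho=1)
after_pivot = update([0.5, -0.25], [1.0, -0.5], 1, 1, theta, r)

# Bound flip case: no basic variable blocks before the entering variable crosses over.
theta_flip, r_flip = ratio_test([0.0], [0.25], rho=1)
after_flip = update([0.0], [0.25], 1, 1, theta_flip, r_flip)
```

In the bound flip case the basis is unchanged and only the values move, which is why no pivot is needed when $\theta = 2$.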


The  algorithm  repeats  the  above  procedure  until  conditions  (6)  are 
satisfied. 

3.  Computational  Experience 

The  dual  algorithm  was  coded  in  FORTRAN  IV  and  tested  on  the  CDC  6600 
computer  at  The  University  of  Texas  at  Austin.  The  observations  were 
drawn  from  various  uniform  and  normal  distributions  using  a  random  number 
generator.  Three  versions,  called  PROGRAM  I,  PROGRAM  II,  and  PROGRAM  III 
were  implemented. 

PROGRAM  I  is  the  version  of  the  algorithm  where  initially  all  the 
nonbasic  variables  are  set  to  their  upper  bound  value,  namely,  +1.  All 
the  computations  follow  the  scheme  of  the  algorithm  as  described  earlier. 

PROGRAM II initially sets the values of the nonbasic variables based
on the signs of the reduced costs. The process is given by:

$$\pi_i = \operatorname{sign}(Y_i - Y_B B^{-1} x_i), \quad i \in NB.$$
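A minimal Python sketch of this start rule is given below. It takes the quantities $z_i = Y_B B^{-1} x_i$ as already computed and only applies the sign rule. The names `sign` and `start_bounds` and the numbers are hypothetical, and the source does not say what happens when $Y_i = z_i$; the sketch simply returns 0 in that case.

```python
def sign(v):
    """Return +1, -1 or 0 according to the sign of v."""
    return (v > 0) - (v < 0)

def start_bounds(Y, z, NB):
    """Start value of each nonbasic pi_i: +1 if Y_i > z_i, -1 if Y_i < z_i."""
    return {i: sign(Y[i] - z[i]) for i in NB}

# Hypothetical observations and z values for two nonbasic indices.
Y = {2: 5.0, 3: 3.0}
z = {2: 2.0, 3: 4.5}
start = start_bounds(Y, z, NB=[2, 3])
```

A nonbasic variable started this way already has the reduced cost sign that its bound requires for the given basis, which is the motivation for using it as a start procedure.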

Additionally, a chaining technique is implemented to choose the entering
variable. Instead of selecting the maximum of the absolute value of the
reduced costs, this technique considers the first eligible candidate,
say $\pi_s$, which satisfies the conditions in (4) to become basic. Note
that in this example the first chain starts from the first position and
ends in the s-th position of the list of the reduced costs. The pivoting
process of the algorithm will then take place. Two situations may arise:


(i)  If $-1 \le \pi_i \le +1$, $i \in IB$, is satisfied, the algorithm will
either proceed to phase 2 if the process is currently in
the phase 1 procedure or terminate if it is in the phase 2
procedure.

(ii)  However, if $-1 \le \pi_i \le +1$, $i \in IB$, is not satisfied, the
procedure continues to search for the next eligible nonbasic
variable by utilizing a pointer to indicate where the previous
chain ends and evaluates the reduced costs based on the new
basis starting from the (s + 1)th position until a nonbasic
variable which satisfies the conditions in (4) is found.
This nonbasic variable will then enter the basis and the
process continues until all the basic variables are feasible.
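The pointer-based scan of the chaining technique can be sketched in Python as follows. The wrap-around to the start of the list once the end is reached is an assumption of this sketch, since the source does not state it. The names `next_in_chain` and `eligible` and the numbers are hypothetical, and the reduced costs are held fixed here although the real method would recompute them from the new basis after each pivot.

```python
def next_in_chain(reduced_cost, n, start, is_eligible):
    """Scan positions start, start+1, ... (wrapping once) and return the first eligible one."""
    for step in range(n):
        k = (start + step) % n
        if is_eligible(k, reduced_cost(k)):
            return k
    return None

# Hypothetical list of nonbasic variables: bound status and reduced cost per position.
status = ["L", "U", "L", "U", "L"]
z = [0.5, -1.0, -2.0, 3.0, -0.1]

def eligible(k, zk):
    # Negative reduced cost at the lower bound, positive at the upper bound.
    return (status[k] == "L" and zk < 0) or (status[k] == "U" and zk > 0)

first = next_in_chain(lambda k: z[k], len(z), 0, eligible)
second = next_in_chain(lambda k: z[k], len(z), first + 1, eligible)
```

Only the positions up to the first eligible candidate are evaluated in each chain, which is how the technique avoids computing and comparing every reduced cost.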

An  advantage  of  this  chaining  technique  is  that  the  time-consuming 
sorting  process  of  the  maximum  of  the  absolute  reduced  costs  is  eliminated. 
However,  in  some  situations  when  a  large  number  of  chains  has  to  be 
developed,  much  access  time  is  consumed. 

PROGRAM  III  is  a  version  of  PROGRAM  II  where  a  candidate  list  replaces 
the  chaining  process  in  the  phase  2  procedure.  The  phase  1  procedure  still 
maintains  the  chaining  technique.  The  length  of  the  candidate  list  can  be 
easily assigned by the user. Suppose the conditions in (6) are not
satisfied and the length of the candidate list is assigned to be 5. The
candidates in this list will be stored in terms of the indices of the
first five nonbasic variables which satisfy the conditions in (4). Moreover,
the candidates are sorted in descending order based on the absolute values
of their reduced costs. The process will then choose the first candidate
from the list to become basic. The pivoting process of the algorithm will
be carried out. If the conditions in (6) are satisfied, the phase 2
procedure will be terminated and the current solution will be optimal.

On  the  other  hand,  if  the  conditions  in  (6)  are  violated,  the  process 
examines  the  next  candidate  on  the  list  by  calculating  its  reduced  cost 
from  the  new  basis.  If  the  reduced  cost  satisfies  (4),  this  candidate 
will  be  brought  into  the  basis.  Otherwise,  the  process  proceeds  to  examine 
the remaining candidates on the list. In the event that all the candidates
on the list have been examined, a new list of length five will be
formed. This process continues until the conditions in (6) are satisfied.
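The construction of the candidate list can be sketched in Python as follows. Here `z` stands for the phase 2 reduced costs $z_k - Y_k$ of the nonbasic variables, and the name `build_candidate_list`, the status labels and the numbers are hypothetical.

```python
def build_candidate_list(z, status, length):
    """Collect the first `length` eligible nonbasic indices, sorted by |z| in descending order."""
    candidates = []
    for k in sorted(z):
        zk = z[k]
        if (status[k] == "L" and zk < 0) or (status[k] == "U" and zk > 0):
            candidates.append(k)
            if len(candidates) == length:
                break
    candidates.sort(key=lambda k: abs(z[k]), reverse=True)
    return candidates

# Hypothetical reduced costs and bound status for six nonbasic variables.
z = {0: -1.0, 1: 2.0, 2: 0.5, 3: 4.0, 4: -3.0, 5: 6.0}
status = {0: "L", 1: "U", 2: "U", 3: "L", 4: "L", 5: "U"}
candidates = build_candidate_list(z, status, length=3)
```

Index 5 has the largest absolute reduced cost in this data but is not among the first three eligible variables, so it does not appear on the list; sorting applies only within the list.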

The candidate list structure provides an alternative to the chaining
technique described earlier.

The  computational  results  given  in  Table  1  are  mean  times  and  iteration 
counts  which  include  the  number  of  basis  changes  for  a  set  of  10  problems 
with  the  same  characteristics. 


Number of     Number of        PROGRAM I    PROGRAM II    PROGRAM III
Parameters    Observations

    10             50             223           158           154
                                   78            59            86   (L=5)

    10            100             408           331           595
                                  132            94           248   (L=5)

    10            200            1514          1225           924
                                  297           215           381   (L=4)

    20            200            3084          z96y          2702
                                  334           269           569   (L=4)

    30            200            7648          5275          6655
                                  435           315           854   (L=6)

The upper number in each row is the mean time in CPU milliseconds; the
lower number is the mean number of iterations required; and L is the
length of the candidate list.

Table 1.  Computational Testings of PROGRAM I, PROGRAM II and
PROGRAM III.
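One way to read Table 1 is to divide mean time by mean iteration count. The Python sketch below does this arithmetic under the reading stated in the table note (upper number is time in CPU milliseconds, lower number is iterations, columns in the order PROGRAM I, II, III). The PROGRAM II time for 20 parameters and 200 observations is illegible in the source and is left out. The names `table1` and `ms_per_iteration` are hypothetical.

```python
# (parameters, observations): ((time_I, time_II, time_III), (iter_I, iter_II, iter_III))
# The illegible PROGRAM II time for (20, 200) is recorded as None.
table1 = {
    (10, 50): ((223, 158, 154), (78, 59, 86)),
    (10, 100): ((408, 331, 595), (132, 94, 248)),
    (10, 200): ((1514, 1225, 924), (297, 215, 381)),
    (20, 200): ((3084, None, 2702), (334, 269, 569)),
    (30, 200): ((7648, 5275, 6655), (435, 315, 854)),
}

def ms_per_iteration(times, iterations):
    """Mean CPU milliseconds per iteration for each program; None where the time is missing."""
    return [None if t is None else t / k for t, k in zip(times, iterations)]

per_iter = {size: ms_per_iteration(t, k) for size, (t, k) in table1.items()}
```

On this reading, PROGRAM III needs more iterations than PROGRAM I in every row but spends less time per iteration, which fits the description of the candidate list as a cheaper way of choosing the entering variable.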


4.  Conclusion 


This paper presents a dual algorithm to solve linear least absolute
value approximations. The algorithm utilizes the concept of linear
programming methodology. Strategies involving start procedures are examined. The
techniques of chaining and candidate list structure are implemented. From
the computational testings, these techniques enhance the efficiency of the
algorithm in terms of computer time.